Search CORE

26 research outputs found

Taming Strings in Dynamic Languages - An Abstract Interpretation-based Static Analysis Approach

Author: ARCERI VINCENZO
Publication venue
Publication date: 01/01/2020
Field of study

In the recent years, dynamic languages such as JavaScript, Python or PHP, have found several fields of applications, thanks to the multiple features provided, the agility of deploying software and the seeming facility of learning such languages. In particular, strings play a central role in dynamic languages, as they can be implicitly converted to other type values, used to access object properties or transformed at run-time into executable code. In particular, the possibility to dynamically generate code as strings transformation breaks the typical assumption in static program analysis that the code is an immutable object, indeed static. This happens because program\u2019s essential data structures, such as the control-flow graph and the system of equation associated with the program to analyze, are themselves dynamically mutating objects. In a sentence: "You can\u2019t check the code you don\u2019t see". For all these reasons, dynamic languages still pone a big challenge for static program analysis, making it drastically hard and imprecise. The goal of this thesis is to tackle the problem of statically analyzing dynamic code by treating the code as any other data structure that can be statically analyzed, and by treating the static analyzer as any other function that can be recursively called. Since, in dynamically-generated code, the program code can be encoded as strings and then transformed into executable code, we first define a novel and suitable string abstraction, and the corresponding abstract semantics, able to both keep enough information to analyze string properties, in general, and keep enough information about the possible executable strings that may be converted to code. Such string abstraction will permits us to distill from a string abstract value the executable program expressed by it, allowing us to recursively call the static analyzer on the synthesized program. The final result of this thesis is an important first step towards a sound-by- construction abstract interpreter for real-world dynamic string manipulation languages, analyzing also string-to-code statements, that is the code that standard static analysis "can\u2019t see"

Catalogo dei prodotti della ricerca

Analyzing Dynamic Code: A Sound Abstract Interpreter for evil eval

Author: Arceri Vincenzo
Mastroeni Isabella
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021
Field of study

Dynamic languages, such as JavaScript, employ string-to-code primitives to turn dynamically generated text into executable code at run-time. These features make standard static analysis extremely hard if not impossible because its essential data structures, i.e., the control-flow graph and the system of recursive equations associated with the program to analyze, are themselves dynamically mutating objects. Nevertheless, assembling code at run-time by manipulating strings, such as by eval in JavaScript, has been always strongly discouraged since it is often recognized that \u201ceval is evil", leading static analyzers to not consider such statements or ignoring their effects. Unfortunately, the lack of formal approaches to analyze string-to-code statements pose a perfect habitat for malicious code, that is surely evil and do not respect good practice rules, allowing them to hide malicious intents as strings to be converted to code and making static analyses blind to the real malicious aim of the code. Hence, the need to handle string-to-code statements approximating what they can execute, and therefore allowing the analysis to continue (even in presence of dynamically generated program statements) with an acceptable degree of precision, should be clear. In order to reach this goal, we propose a static analysis allowing us to collect string values and to soundly over-approximate and analyze the code potentially executed by a string-to-code statement

Catalogo dei prodotti della ricerca

Static Program Analysis for String Manipulation Languages

Author: Arceri Vincenzo
Mastroeni Isabella
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2019
Field of study

In recent years, dynamic languages, such as JavaScript or Python, have been increasingly used in a wide range of fields and applications. Their tricky and misunderstood behaviors pose a hard challenge for static analysis of these programming languages. A key aspect of any dynamic language program is the multiple usage of strings, since they can be implicitly converted to another type value, transformed by string-to-code primitives or used to access an object-property. Unfortunately, string analyses for dynamic languages still lack precision and do not take into account some important string features. Moreover, string obfuscation is very popular in the context of dynamic language malicious code, for example, to hide code information inside strings and then to dynamically transform strings into executable code. In this scenario, more precise string analyses become a necessity. This paper is placed in the context of static string analysis by abstract interpretation and proposes a new semantics for string analysis, placing a first step for handling dynamic languages string features.Comment: In Proceedings VPT 2019, arXiv:1908.0672

arXiv.org e-Print Archive

Catalogo dei prodotti della ricerca

Abstract Domains for Type Juggling

Author: Sergio Maffeis
Vincenzo Arceri
Publication venue
Publication date: 27/07/2016
Field of study

Web scripting languages, such as PHP and JavaScript, provide a wide range of dynamic features that make them both flexible and error-prone. In order to prevent bugs in web applications, there is a sore need for powerful static analysis tools. In this paper, we investigate how Abstract Interpretation may be leveraged to provide a precise value analysis providing rich typing information that can be a useful component for such tools. In particular, we define the formal semantics for a core of PHP that illustrates type juggling, the implicit type conversions typical of PHP, and investigate the design of abstract domains and operations that, while still scalable, are expressive enough to cope with type juggling. We believe that our approach can also be applied to other languages with implicit type conversions

Catalogo dei prodotti della ricerca

Spiral - Imperial College Digital Repository

Open Access Repository

Static Analysis for ECMAScript String Manipulation Programs

Author: Arceri Vincenzo
Mastroeni Isabella
Xu Sunyi
Publication venue: 'MDPI AG'
Publication date: 01/01/2020
Field of study

In recent years, dynamic languages, such as JavaScript or Python, have been increasingly used in a wide range of fields and applications. Their tricky and misunderstood behaviors pose a great challenge for static analysis of these languages. A key aspect of any dynamic language program is the multiple usage of strings, since they can be implicitly converted to another type value, transformed by string-to-code primitives or used to access an object-property. Unfortunately, string analyses for dynamic languages still lack precision and do not take into account some important string features. In this scenario, more precise string analyses become a necessity. The goal of this paper is to place a first step for precisely handling dynamic language string features. In particular, we propose a new abstract domain approximating strings as finite state automata and an abstract interpretation-based static analysis for the most common string manipulating operations provided by the ECMAScript specification. The proposed analysis comes with a prototype static analyzer implementation for an imperative string manipulating language, allowing us to show and evaluate the improved precision of the proposed analysis

Catalogo dei prodotti della ricerca

Static analysis for dummies: experiencing LiSA

Author: Arceri Vincenzo
Cortesi Agostino
Ferrara Pietro
Negrini Luca
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021
Field of study

Semantics-based static analysis requires a significant theoretical background before being able to design and implement a new analysis. Unfortunately, the development of even a toy static analyzer from scratch requires to implement an infrastructure (parser, control flow graphs representation, fixpoint algorithms, etc.) that is too demanding for bachelor and master students in computer science. This approach difficulty can condition the acquisition of skills on software verification which are of major importance for the design of secure systems. In this paper, we show how LiSA (Library for Static Analysis) can play a role in that respect. LiSA implements the basic infrastructure that allows a non-expert user to develop even simple analyses (e.g., dataflow and numerical non-relational domains) focusing only on the design of the appropriate representation of the property of interest and of the sound approximation of the program statements

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Abstract Domains for Type Juggling

Author: Arceri Vincenzo
Maffeis Sergio
Publication venue
Publication date: 01/01/2017
Field of study

Catalogo dei prodotti della ricerca

Information Flow Analysis for Detecting Non-Determinism in Blockchain

Author: Agostino Cortesi
Fabio Tagliaferro
Fausto Spoto
Luca Negrini
Luca Olivieri
Pietro Ferrara
Vincenzo Arceri
Publication venue: Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 01/01/2023
Field of study

A mandatory feature for blockchain software, such as smart contracts and decentralized applications, is determinism. In fact, non-deterministic behaviors do not allow blockchain nodes to reach one common consensual state or a deterministic response, which causes the blockchain to be forked, stopped, or to deny services. While domain-specific languages are deterministic by design, general purpose languages widely used for the development of smart contracts such as Go, provide many sources of non-determinism. However, not all non-deterministic behaviours are critical. In fact, only those that affect the state or the response of the blockchain can cause problems, as other uses (for example, logging) are only observable by the node that executes the application and not by others. Therefore, some frameworks for blockchains, such as Hyperledger Fabric or Cosmos SDK, do not prohibit the use of non-deterministic constructs but leave the programmer the burden of ensuring that the blockchain application is deterministic. In this paper, we present a flow-based approach to detect non-deterministic vulnerabilities which could compromise the blockchain. The analysis is implemented in GoLiSA, a semantics-based static analyzer for Go applications. Our experimental results show that GoLiSA is able to detect all vulnerabilities related to non-determinism on a significant set of applications, with better results than other open-source analyzers for blockchain software written in Go

Archivio istituzionale della Ricerca - Università degli Studi di Parma

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Dagstuhl Research Online Publication Server

Catalogo dei prodotti della ricerca

Information Flow Analysis for Detecting Non-Determinism in Blockchain (Artifact)

Author: Arceri Vincenzo
Cortesi Agostino
Ferrara Pietro
Negrini Luca
Olivieri Luca
Spoto Fausto
Tagliaferro Fabio
Publication venue: DARTS - Dagstuhl Artifacts Series. DARTS, Volume 9, Issue 2, Special Issue of the 37th European Conference on Object-Oriented Programming (ECOOP 2023)
Publication date: 01/01/2023
Field of study

A mandatory feature for blockchain software, such as smart contracts and decentralized applications, is determinism. In fact, non-deterministic behaviors do not allow blockchain nodes to reach one common consensual state or a deterministic response, which causes the blockchain to be forked, stopped, or to deny services. While domain-specific languages are deterministic by design, general-purpose languages widely used for the development of smart contracts such as Go, provide many sources of non-determinism. However, not all non-deterministic behaviours are critical. In fact, only those that affect the state or the response of the blockchain can cause problems, as other uses (for example, logging) are only observable by the node that executes the application and not by others. Therefore, some frameworks for blockchains, such as Hyperledger Fabric or Cosmos SDK, do not prohibit the use of non-deterministic constructs but leave the programmer the burden of ensuring that the blockchain application is deterministic. In this paper, we present a flow-based approach to detect non-deterministic vulnerabilities which could compromise the blockchain. The analysis is implemented in GoLiSA, a semantics-based static analyzer for Go applications. Our experimental results show that GoLiSA is able to detect all vulnerabilities related to non-determinism on a significant set of applications, with better results than other open-source analyzers for blockchain software written in Go

Archivio istituzionale della Ricerca - Università degli Studi di Parma

Dagstuhl Research Online Publication Server

SEA: String Executability Analysis by Abstract Interpretation

Author: Arceri Vincenzo
Dalla Preda Mila
Giacobazzi Roberto
Mastroeni Isabella
Publication venue: Dipartimento di Informatica - Universit\ue0 di Verona
Publication date: 01/01/2017
Field of study

Dynamic languages often employ reflection primitives to turn dynamically generated text into executable code at run-time. These features make stan- dard static analysis extremely hard if not impossible because its essential data structures, i.e., the control-flow graph and the system of recursive equa- tions associated with the program to analyse, are themselves dynamically mutating objects. We introduce SEA, an abstract interpreter for automatic sound string executability analysis of dynamic languages employing bounded (i.e, finitely nested) reflection and dynamic code generation. Strings are stat- ically approximated in an abstract domain of finite state automata with basic operations implemented as symbolic transducers. SEA combines standard program analysis together with string executability analysis. The analysis of a call to reflection determines a call to the same abstract interpreter over a code which is synthesised directly from the result of the static string exe- cutability analysis at that program point. The use of regular languages for approximating dynamically generated code structures allows SEA to soundly approximate safety properties of self modifying programs yet maintaining ef- ficiency. Soundness here means that the semantics of the code synthesised by the analyser to resolve reflection over-approximates the semantics of the code dynamically built at run-rime by the program at that point

arXiv.org e-Print Archive

Catalogo dei prodotti della ricerca